Playout Policy Adaptation for Games

نویسنده

  • Tristan Cazenave
چکیده

Monte Carlo Tree Search (MCTS) is the state of the art algorithm for General Game Playing (GGP). We propose to learn a playout policy online so as to improve MCTS for GGP. We test the resulting algorithm named Playout Policy Adaptation (PPA) on Atarigo, Breakthrough, Misere Breakthrough, Domineering, Misere Domineering, Go, Knightthrough, Misere Knightthrough, Nogo and Misere Nogo. For most of these games, PPA is better than UCT with a uniform random playout policy, with the notable exceptions of Go and Nogo.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Nested Rollout Policy Adaptation with Selective Policies

Monte Carlo Tree Search (MCTS) is a general search algorithm that has improved the state of the art for multiple games and optimization problems. Nested Rollout Policy Adaptation (NRPA) is an MCTS variant that has found record-breaking solutions for puzzles and optimization problems. It learns a playout policy online that dynamically adapts the playouts to the problem at hand. We propose to enh...

متن کامل

Playout policy adaptation with move features

Monte Carlo Tree Search (MCTS) is the state of the art algorithm for General Game Playing (GGP). We propose to learn a playout policy online so as to improve MCTS for GGP. We also propose to learn a policy not only using the moves but also according to the features of the moves. We test the resulting algorithms named Playout Policy Adaptation (PPA) and Playout Policy Adaptation with move Featur...

متن کامل

Memorizing the Playout Policy

Monte Carlo Tree Search (MCTS) is the state of the art algorithm for General Game Playing (GGP). Playout Policy Adaptation with move Features (PPAF) is a state of the art MCTS algorithm that learns a playout policy online. We propose a simple modification to PPAF consisting in memorizing the learned policy from one move to the next. We test PPAF with memorization (PPAFM) against PPAF and UCT fo...

متن کامل

Optimization of a packet video receiver under different levels of delay jitter: an analytical approach

This paper studies the problem of analyzing and designing optimal playout adaptation policies for packet video receivers (PVRs) that operate in a delay jitter inducing best-effort network, like the current Internet. The developed system model is built around the Ek/Di/1/N phase-type queue and allows for the effective modeling of key design and system parameters, such as: the level of delay jitt...

متن کامل

Playout Search for Monte-Carlo Tree Search in Multi-player Games

Monte-Carlo Tree Search (MCTS) has become a popular search technique for playing multi-player games over the past few years. In this paper we propose a technique called Playout Search. This enhancement allows the use of small searches in the playout phase of MCTS in order to improve the reliability of the playouts. We investigate max, Paranoid and BRS for Playout Search and analyze their perfor...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2015